(54) Computer network telephony

(57) There is disclosed a method and apparatus for
connecting computer network ip telephones using a
voice recognition engine and an ip address database on
an Internet server. The method comprises opening a
voice channel from one of the ip phones to a voice
recognition server; determining the name of the
addressee from a speech input sent over the voice
channel to the voice recognition server by the caller;
determining an ip address from an ip address database
corresponding to the determined addressee's name;
opening a data channel from the database and
transmitting the ip address to one or other of said
telephones; and routing logic on said one or other ip
phones using the ip address to establish a connection
with the other ip phone. This allows the ip phones to
access remote resources of voice recognition and a large
database, thereby taking advantage of more powerful
resources than would be available locally. This is
particularly advantageous for pervasive computing
devices which have limited resources for storage of ip
addresses.
Description
FIELD OF INVENTION 

[0001] This invention relates to computer network 
telephony. In particular it relates to making connections 
between network telephones on a computer network. 

BACKGROUND OF INVENTION 

[0002] Although originally intended for the transmission
of computer data, computer networks, and specifically
the Internet, have more recently been exploited to
provide real time telephony communications. The primary
attraction of the Internet for telephony communications
is the low charge compared with conventional telephony,
or the plain old telephone system (POTS). Many Internet
users have a dial-up connection to an access provider
over a local telephone line, and therefore such users
pay only local telephone charges when logged on. Some
access providers charge a monthly subscription, whilst
others charge on the basis of connection time (some may
do both). However, there is generally no charge
associated with actual data transfer over the network.
As a result, the effective cost of an international call
over the Internet may be no more than that of a local
call of the same duration to the access provider. In
addition, the fully digital nature of the Internet may
potentially offer a richer functionality (e.g. in terms
of conference calling) than conventional telephone
networks. Internet phones are surveyed in the article
"Dial 1-800-Internet" in Byte Magazine, February 1996,
pages 83-88 and in the article "Nattering On", in New
Scientist, 2 March 1996, pages 38-40.
[0003] The transmission of voice signals over a packet
network is described for example in "Using Local Area
Networks for Carrying Online Voice" by D. Cohen, pages
13-21, and in "Voice Transmission over an Ethernet
Backbone" by P. Ravasio, R. Marcogliese, and R.
Novarese, pages 39-65, both in "Local Computer Networks"
(edited by P. Ravasio, G. Hopkins, and N. Naffah; North
Holland, 1982) and also in GB 2283252. The basic
principles of such a scheme are that a first computer
digitally samples a voice input signal at a regular rate
(e.g. 8 kHz). A number of samples are then assembled
into a data packet for transmission over the network to
a second terminal, which then feeds the samples to a
loudspeaker or equivalent device for play out, again at
a constant 8 kHz rate. Voice transmission over the
Internet is substantially similar to transmission over a
LAN (which may indeed provide part of the Internet
transmission path), but there tends to be less spare
bandwidth available on the Internet. As a result,
Internet phones normally compress the voice signal at
the transmitting end, and then decompress it at the
receiving end.
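
By way of illustration only, the sampling and packet assembly described above might be sketched as follows. The 8 kHz rate is taken from the text; the packet size of 160 samples (20 ms), the 16-bit sample width, the 4-byte sequence number and all function names are assumptions made for this sketch and are not taken from the cited references. A sine tone stands in for the voice input, and compression is omitted.

```python
import math
import struct

SAMPLE_RATE_HZ = 8000      # sampling rate quoted in the text
SAMPLES_PER_PACKET = 160   # assumption: 20 ms of audio per packet


def sample_tone(freq_hz, duration_ms):
    """Return 16-bit samples of a sine tone standing in for a voice input."""
    count = SAMPLE_RATE_HZ * duration_ms // 1000
    return [
        int(32767 * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE_HZ))
        for n in range(count)
    ]


def packetise(samples):
    """Split samples into packets: a 4-byte sequence number plus raw samples."""
    packets = []
    for seq, start in enumerate(range(0, len(samples), SAMPLES_PER_PACKET)):
        chunk = samples[start:start + SAMPLES_PER_PACKET]
        payload = struct.pack(f"<{len(chunk)}h", *chunk)
        packets.append(struct.pack("<I", seq) + payload)
    return packets


def depacketise(packets):
    """Reorder packets by sequence number and recover the sample stream."""
    samples = []
    for packet in sorted(packets, key=lambda p: struct.unpack("<I", p[:4])[0]):
        payload = packet[4:]
        samples.extend(struct.unpack(f"<{len(payload) // 2}h", payload))
    return samples
```

The sequence number lets the receiving terminal restore the original order if packets arrive out of order, so that the samples can be played out at the constant rate.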

[0004] Voice directories for POTS are known. Wildfire is
an "Advanced Voice-Controlled Electronic Assistant". It
has various capabilities, including acting as a "voice
dialler", wherein the user can speak a telephone number
they wish to call into a phone which has a connection to
the Wildfire system, and the Wildfire system can perform
a transfer to the telephone number requested. Users can
also set up to 150 "nicknames" for commonly used numbers
such as "work", "home", "bill", etc. and just ask
Wildfire to "call Bill", for example. Wildfire is not an
IP telephony based product and does not allow for
scaling to very large numbers of names in a directory.
Furthermore it is an internal company directory which
uses a private branch switch to make connections. For
further information, see http://www4.wildfire.com.
[0005] Another POTS voice directory, ViaVoice Directory
Dialler, prompts callers for a person's name, requests
further information when duplicate names are
encountered and transfers the call to the number which
equates with that person's name. It currently has
support for up to 250,000 names. It is not an IP
telephony based product and uses a private branch switch
based in the company or internal telephone network. For
further information, see
http://www.software.ibm.com/speech/overview/business/direct.html.
[0006] An ip address is a unique identification which
uses several bytes of memory, and so takes more memory
to store than a nickname or abbreviated address. This
can cause a problem for thin devices with reduced memory
capacity. This problem will become more prominent as the
number of telephony addresses in the world rises at the
current rate. Furthermore, in a few years' time the
number of unique addresses will be reaching a limit and
a new unique format may have to be used, using far more
numbers and memory. This is not such a problem for POTS
telephones, which may use local telephone numbers or
extensions to request connections from switches.

[0007] Internet telephony uses a transient network of
computers to send discrete packets of data between
destinations. Unlike with POTS telephones, the route the
voice data take may vary over the course of a
conversation, so it is necessary that the network phones
themselves have the full address information of the
destination available.



[0008] According to one aspect of the invention 
there is provided a method of connecting computer network
ip telephones, the method comprising: opening a voice channel from one
of said ip phones to a voice recognition server; deter- 
mining a name from a speech input sent over the voice 
channel to the voice recognition server; determining an 
ip address from an ip address database corresponding 
to the determined name; opening a data channel from 
the database and transmitting the ip address to one or 
other of said telephones; and routing logic on said one 
or other ip phones using the ip address to establish a 
connection with the other ip phone. 
[0009] This allows the ip phones to access remote
resources of voice recognition and a large database 
thereby taking advantage of more powerful resources 
than would be available locally. This is particularly 
advantageous for pervasive computing devices which 
have limited resources. 

[0010] The network phone differs from a normal 
phone in the following manner: it only has a single mul- 
tipurpose button, has no numeric identification on it, and 
plugs into a data network rather than a telephone line 
connected to a switch. It may be a virtual device on a 
screen rather than an actual physical device. The device 
has the capability to set up a voice stream ("telephone 
call") to another similar device (virtual or physical) on 
the same data network. The device receives the 
address (not number) of the other device from a direc- 
tory dialler, to which it will set up a connection whenever 
the single multipurpose button is pressed, so that the 
caller can declare the name (not number) of the person 
to whom a call is required and the directory dialler can 
supply the address to which a connection is to be made. 
Hence the addressing logic resides in the directory dial- 
ler, but the "switching logic" lies in the phone itself, 
which is to say that whereas a telephone connected to a 
switch always makes a connection through the switch, 
the Numberless LAN phone only makes a connection of 
its own initiative through the data network. This is 
already achieved by Internet phones (e.g. CoolTalk for
Netscape). The product may make calls to devices 
(phones, Internet phones, other Numberless LAN 
phones) outside of the network in which it is able to 
make connections of its own initiative by using the direc- 
tory dialler as a gateway - however this ability is unlikely 
to effectively differentiate the product, as it is really a 
property of the directory dialler/gateway. 
[0011] Advantageously the voice channel to the 
voice recognition server is opened immediately on acti- 
vation of the said one ip phone. This can be achieved 
when the phone is taken off the hook. An ip socket is
opened through the voice over ip interface to the remote 
voice recognition server. Since no buttons need be 
pressed by a caller all buttons may be removed from the 
phone interface increasing the ease of use and lowering 
manufacturing costs. 

[0012] The voice recognition server may send a 
voice message requesting the name of the other ip 
phone or user be spoken into the ip phone. The caller 
responds and the spoken name is transmitted to the voice
recognition functionality on the remote server. 
[0013] According to another aspect of the invention
there is provided a computer network telephone com- 
prising: voice recognition functionality; a network 
address database functionality; and a routing module; 
wherein the voice recognition functionality will deter- 
mine a name from a spoken name, an ip address will be 
determined from the database using the name, and the 
routing logic module will use the ip address to establish 
a connection with another network telephone. 



[0014] Preferably the routing logic module is an 
integral part of the network telephone and allows the ip 
phone to route calls directly to other ip phones given their
ip address. 

[0015] The voice recognition engine may be provided in a
remote server and also the ip address database may be
provided in a remote server. More preferably the ip
address database and the voice recognition functionality
are provided in the same remote server so that there is
minimum communication time between the two functions.

[0016] According to a further aspect of the invention
there is provided a network server comprising: a voice
recognition engine; an Internet telephony database; a
network interface; and a routing module; wherein the
voice recognition engine is adapted to perform
recognition on a spoken name corresponding to a second
network phone, said spoken name is received through the
network interface from a first network telephone; an ip
address corresponding to the recognised name is located
in the Internet telephony database and sent back to the
first network phone through the routing module so that a
connection may be established between the first and
second network phones.
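
For illustration, the lookup of an ip address from a recognised name might be sketched as below. This is a minimal in-memory stand-in, not the database of the embodiment: the class name, its methods, the normalisation of names and the example address are all assumptions made for the sketch.

```python
class DirectoryDatabase:
    """Hypothetical in-memory stand-in for the Internet telephony database."""

    def __init__(self):
        self._addresses = {}

    @staticmethod
    def _key(name):
        # Normalise case and spacing so "Bill Smith" and " bill  smith " match.
        return " ".join(name.lower().split())

    def register(self, name, ip_address):
        """Record the ip address of the network phone used by this person."""
        self._addresses[self._key(name)] = ip_address

    def lookup(self, recognised_name):
        """Return the ip address for a recognised name, or None if unknown."""
        return self._addresses.get(self._key(recognised_name))
```

A server built this way would return None for an unrecognised name, at which point it could prompt the caller again rather than send an address back to the first network phone.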

[0017] The intention of at least the embodiment of the
invention is to facilitate the elimination of long
telephone numbers, diverse and inflexible numbering
plans, and potentially telephone switches themselves.
Furthermore it is hoped that telephones with numbered
dialpads will eventually be replaced by telephones
without a dialpad.

BRIEF DESCRIPTION OF DRAWINGS 

[0018] In order to promote a fuller understanding of
this and other aspects of the present invention, an
embodiment will now be described, by way of example
only, with reference to the accompanying drawings in
which:

Figure 1 is a schematic representation of two computer
network telephones connected by a computer network; and

Figure 2 shows the method used to connect the network
telephones.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENT

[0019] The embodiment comprises first and second network
telephones 13A,B connected to the server 10 via a
computer network 11 (see Figure 1). The preferred
network is the Internet but the network can be a wide
area network or a local area network. The server 10 is
connected to the Internet via a network adapter or via
an Internet gateway, for example in another server on
the LAN. In the embodiment the first and second network
telephones 13A,B are connected independently to the
Internet.

[0020] A personal computer set up as a network telephone
typically has a microprocessor, semiconductor memory
(ROM/RAM), hard disk, and a bus over which data is
transferred (not shown). Other components of the
computer are a display, keyboard and mouse (not shown).
The computer may be any conventional workstation, such
as an Aptiva computer, available from IBM Corporation.
Alternatively, any other form of suitable network access
device, including the new generation of low-cost systems
('network computers' or 'thin clients', effectively
sub-PCs) which are currently being developed, may be
employed as the client telephone terminal.
[0021] Each network telephone is equipped with a network
adapter card and accompanying software including a
routing logic interface 14A,B, voice over ip interface
16A,B and Internet protocol interface 18A,B. The network
adapter provides the hardware layer interface directly
to the LAN or Internet gateway. Alternatively the
Internet is accessed using a modem via an Internet
provider. The operation of a network adapter card or
modem to provide Internet access is well-known, and so
will not be described in detail. The routing logic
interface 14A,B provides the mechanism to select the
server and other network telephones to receive Internet
messages. The voice over ip interface 16A,B provides the
mechanism to convert voice signals to and from Internet
messages. The Internet protocol interface 18A,B provides
the mechanism to set up Internet connections between the
server and the network phone and to send Internet
messages via the connections. Button 24A,B is linked to
the routing logic interface 14A,B. On activation of the
button 24A, an Internet message is sent between the
routing logic interface 14A and the directory server 10.

[0022] An audio card (not shown), for example 
MWave from IBM Corporation, is connected to the bus 
and to a headset including microphone 20A,B and
earphone 22A,B for audio input and output respectively.
Alternatively the network phone may have a loud- 
speaker, and built-in microphone, but the use of a head- 
set is preferred to optimise the quality of the audio 
signal produced and actually heard. 
[0023] The network server 10 is based on a conventional
computer workstation having a display screen, keyboard,
microprocessor, ROM/RAM and disk storage (not shown).
The RISC System/6000 workstation, available from the IBM
Corporation, is an example. The network server 10 is
connected to the Internet via routing logic module 14C,
voice over IP interface module 16C and Internet protocol
interface 18C. The server 10 comprises voice processing
functionality 25 and an IP address database 26.

[0024] The network phone 13A requires routing
information from the directory server 10. When the
button 24A is depressed a data message is sent (step
102, see Figure 2) to check that the server 10 is ready.
The ip address of the directory server is permanently
stored in the memory of network phone 13A and selected
by the routing logic interface 14A so that the IP
interface 18A can set up the data channel. Once it is
established that the directory server 10 is ready, the
voice over ip interface 16A can set up voice channels
between the earphone 22A and microphone 20A of the
network phone and the directory server 10 (step 104).
The caller speaks the name of the recipient intended for
the call, i.e. the user of network phone 13B (step 106).
The directory server 10 performs speech recognition on
the caller's voice to determine the destination of the
call (step 108). The server 10 then looks up the address
of the recipient (step 110) and passes the address back
to phone 13A along the data channel (step 112).
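
The sequence of steps 102 to 112 may be easier to follow as a short sketch. The following is an in-process simulation only, under stated assumptions: there is no real network or speech engine, the "speech" is supplied as text, and the class names, method names and example directory entry are hypothetical.

```python
class DirectoryServer:
    """Stand-in for directory server 10 (no real network or speech engine)."""

    def __init__(self, directory):
        self.directory = directory  # hypothetical name -> ip address table

    def is_ready(self):
        return True

    def recognise(self, speech):
        # Assumption: "speech" is already text; a real server would run a
        # voice recognition engine on audio received over the voice channel.
        return " ".join(speech.lower().split())

    def lookup(self, name):
        return self.directory.get(name)


class NetworkPhone:
    """Stand-in for network phone 13A."""

    def __init__(self, server):
        self.server = server  # the server address is fixed in the phone
        self.steps = []       # step numbers from Figure 2, in order

    def press_button(self, speech):
        self.steps.append(102)  # data message: is the server ready?
        if not self.server.is_ready():
            return None
        self.steps.append(104)  # voice channel set up
        self.steps.append(106)  # caller speaks the recipient's name
        name = self.server.recognise(speech)
        self.steps.append(108)  # server recognises the name
        address = self.server.lookup(name)
        self.steps.append(110)  # server looks up the address
        if address is None:
            return None
        self.steps.append(112)  # address passed back on the data channel
        return address
```

In this sketch an unknown name stops the sequence after step 110, which is one possible behaviour; the description does not say what the server does in that case.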

[0025] Network phone 13A first "pings" network phone 13B
to check that it is available (step 114). The "ping" is
to check that network phone 13B is not already on a
call. Network phone 13A then connects to network phone
13B via Voice-over IP (step 116). Network phone 13B
rings, and the user of network phone 13B can accept the
call by pressing the button on network phone 13B. The
respective users may now communicate over network phones
13A,B as per a normal POTS call (step 118).
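
The availability check and call acceptance of steps 114 to 118 might be sketched as below. This is a simplified state model, not the protocol of the embodiment: the class and function names are hypothetical, the addresses are examples, and treating a ringing phone as unavailable is an assumption of the sketch.

```python
class NetworkPhone:
    """Minimal call state for one network phone."""

    def __init__(self, address):
        self.address = address
        self.ringing_from = None  # caller currently ringing this phone
        self.connected_to = None  # peer of the call in progress

    def ping(self):
        """Step 114: True if the phone is neither ringing nor on a call."""
        return self.ringing_from is None and self.connected_to is None

    def ring(self, caller):
        self.ringing_from = caller

    def press_button(self):
        """Accept the ringing call, if there is one (leading to step 118)."""
        caller = self.ringing_from
        if caller is None:
            return False
        self.ringing_from = None
        self.connected_to = caller
        caller.connected_to = self
        return True


def place_call(caller, callee):
    """Ping the callee and, if it is free, make it ring (steps 114-116)."""
    if not callee.ping():
        return False
    callee.ring(caller)
    return True
```

A third phone calling either party during the call receives a negative result from the ping and so never attempts the connection.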

[0026] Another situation arises when network phone 13A
(or 13B) makes another request to the Directory Server
10, such as a transfer to another phone: "transfer [name
of user of network phone 13B] to [another user's name]".
This would temporarily leave phone 13B "waiting" for the
reconnection of phone 13A (or another user), or the
Directory Server if a transfer is to be made to phones
outside the network. If the request is "hang up" (or
some shorthand agreed for this such as "ok") then phone
13A is instructed by the Directory Server to send a
packet of data to network phone 13B telling it that it
has hung up.
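
As an illustration of how recognised requests of this kind could be distinguished, the following sketch classifies a request once it has been turned into text. The phrases "transfer ... to ...", "hang up" and "ok" come from the description; the function name, the tuple format and the fallback of treating any other phrase as a name to call are assumptions of the sketch.

```python
import re

# Pattern for 'transfer [one user's name] to [another user's name]'.
TRANSFER = re.compile(r"^transfer (?P<who>.+?) to (?P<target>.+)$")
HANG_UP_PHRASES = {"hang up", "ok"}


def parse_request(text):
    """Classify recognised text as a hang-up, a transfer or a name to call."""
    phrase = " ".join(text.lower().split())
    if phrase in HANG_UP_PHRASES:
        return ("hang_up",)
    match = TRANSFER.match(phrase)
    if match:
        return ("transfer", match.group("who"), match.group("target"))
    return ("call", phrase)
```

A name that itself contains the word "to" would be split at its first occurrence, so a practical system would need to check both parts against the directory.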

[0027] The Directory Server has a T1/E1 interface 26 to
a PBX 28 which is connected to other telephone switches
via a POTS telephony network. This allows network phone
users to talk to ordinary telephones using the Directory
Server as a Voice-over IP Gateway (the users on ordinary
telephones could be referred to by their names if the
Directory Server was aware of them, or by their names
and telephone numbers if this was the first call to
their numbers).

[0028] Although the embodiment has been described in
terms of the network phone controlling the routing of
the connection from the first phone to the second phone,
it has been envisaged that the directory server can act
as a node in the connection of the first phone to the
second phone. In this case the server opens a second
channel to the second network phone after the ip address
has been located and then connects the first network
phone channel with the second network phone channel.
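
The alternative in which the server acts as a node might be sketched as follows. This is an in-memory model only: the class names, the use of a queue per channel and the example addresses are assumptions, and real channels would be network connections rather than Python objects.

```python
from collections import deque


class Channel:
    """Packets waiting to be delivered to one network phone."""

    def __init__(self, phone_address):
        self.phone_address = phone_address
        self.outbox = deque()


class RelayServer:
    """Directory server acting as a node between two phone channels."""

    def __init__(self):
        self.links = {}  # channel -> the channel it is connected to

    def connect(self, first, second):
        """Connect the first phone's channel with the second phone's channel."""
        self.links[first] = second
        self.links[second] = first

    def forward(self, source, packet):
        """Pass a packet received on one channel to the connected channel."""
        target = self.links.get(source)
        if target is None:
            return False
        target.outbox.append(packet)
        return True
```

In this arrangement the first phone never needs the ip address of the second phone, at the cost of all voice packets passing through the server.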

[0029] Now that the invention has been described by way
of a preferred embodiment, various modifications and
improvements will occur to those persons skilled in the
art. Therefore it should be understood that the
preferred embodiment has been provided as an example and
not as a limitation.

Claims

1. A method of connecting computer network ip
telephones:

opening a voice channel from one of said ip
phones to a voice recognition server;

determining a name from a speech input sent
over the voice channel to the voice recognition
server;

determining an ip address from an ip address
database corresponding to the determined
name;

opening a data channel from the database and
transmitting the ip address to one or other of
said telephones; and

routing logic on said one or other ip phones
using the ip address to establish a connection
with the other ip phone.

2. A method as claimed in claim 1 whereby the voice
channel to the voice recognition server is opened
immediately on activation of the said one ip phone.

3. A method as claimed in claim 2 whereby the ip
phone is activated when the phone is taken off the
hook.

4. A computer network telephone comprising:

voice recognition functionality;

a network address database functionality; and

a routing logic module;

wherein the voice recognition functionality will
determine a name from a spoken name, an ip
address will be determined from the database
using the name, and the routing logic module
will use the ip address to establish a connection
with another network telephone.

5. A computer network telephone as claimed in claim
4 wherein the routing logic module is an integral
part of the network telephone and allows the ip
phone to route calls directly to other ip phones given
their ip address.

6. A computer network telephone as claimed in claim 4
or 5 wherein the voice recognition engine is provided
in a remote server and also the ip address
database may be provided in a remote server.

7. A computer network telephone as claimed in claim
6 wherein the ip address database and the voice
recognition functionality are provided in the same
remote server.

8. A computer network telephony server comprising:

voice recognition functionality;

network address database functionality; and

routing logic module;

wherein the voice recognition functionality will
determine a name from a spoken name sent to
the server from a first network telephone, an ip
address for a second network phone will be
determined from the database using the name,
and routing logic will use the ip address to
establish a connection between the first and
second network telephone.


EP 1 041 779 A2